Enhanced Text Retrieval Using Natural Language Processing
Similar resources
Information Retrieval Using Robust Natural Language Processing
We developed a fully automated information retrieval system that uses advanced natural language processing techniques to enhance the effectiveness of traditional keyword-based document retrieval. In early experiments with the standard CACM-3204 collection of abstracts, the augmented system has displayed capabilities that made it clearly superior to the purely statistical base system. ...
Document Representation in Natural Language Text Retrieval
In information retrieval, the content of a document may be represented as a collection of terms: words, stems, phrases, or other units derived or inferred from the text of the document. These terms are usually weighted to indicate their importance within the document, which can then be viewed as a vector in an N-dimensional space. In this paper we demonstrate that a proper term weighting is at lea...
Recent Developments in Natural Language Text Retrieval
This paper reports on some recent developments in our natural language text retrieval system. The system uses advanced natural language processing techniques to enhance the effectiveness of term-based document retrieval. The backbone of our system is a traditional statistical engine which builds inverted index files from pre-processed documents, and then searches and ranks the documents in resp...
Web Text Corpus for Natural Language Processing
Web text has been successfully used as training data for many NLP applications. While most previous work accesses web text through search engine hit counts, we created a Web Corpus by downloading web pages to create a topic-diverse collection of 10 billion words of English. We show that for context-sensitive spelling correction the Web Corpus results are better than using a search engine. For t...
Journal
Journal title: Bulletin of the American Society for Information Science and Technology
Year: 2005
ISSN: 0095-4403
DOI: 10.1002/bult.91